ratio of variances
0.9706621
As shown in Listing 11-4, the p value on the F test is 0.4684. As a rule of thumb:
If the p value is greater than 0.05, you would assume equal variances.
If the p value is 0.05 or less, you would assume unequal variances.
In this case, because the p value is greater than 0.05, equal variances can be assumed, and these data
would qualify for the classic Student t test. As described earlier, R gets around this step by defaulting to Welch’s t test, which accommodates both unequal and equal variances.
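As a rough sketch of how these commands fit together in R (the data frame glucose_data and the variables glucose and married_vs_other are placeholder names for illustration, not the book’s actual NHANES objects), you could run the F test and both flavors of the t test like this:

var.test(glucose ~ married_vs_other, data = glucose_data)   # F test of equal variances
t.test(glucose ~ married_vs_other, data = glucose_data)   # Welch t test (R’s default)
t.test(glucose ~ married_vs_other, data = glucose_data, var.equal = TRUE)   # classic Student t test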
Assessing the ANOVA
In this section, we present the basic concepts underlying the analysis of variance (ANOVA), which
compares the means of three or more groups. We also describe some of the more popular post-hoc
tests used to follow a statistically significant ANOVA. Finally, we show you how to run commands to
execute an ANOVA and post-hoc tests in R, and interpret the output.
Grasping how the ANOVA works
As described earlier in “Surveying Student t tests,” it is only possible to run a t test on two groups.
This is why we demonstrated the t test comparing married NHANES participants (M) to all other
marital statuses (OTH). We were testing the null hypothesis M – OTH = 0 because we were only
allowed to compare two groups! So when comparing three groups, such as married (M), never married
(NM), and all others (OTH), it’s natural to think of pairing up the groups and running three t tests
(meaning testing M – NM, then testing M – OTH, then testing NM – OTH). But running an exhaustive
set of two-group t tests increases the likelihood of Type I error, which is where you get a statistically significant result purely by chance (for a review, read Chapter 3). And this is just with three
groups!
The general rule is that N groups can be paired up in N(N – 1)/2 different ways, so in a study with six groups, you’d have (6 × 5)/2, or 15 two-group comparisons, which is way too many.
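If you want R to do this counting for you, the built-in choose() function returns the number of ways to pick two groups out of N:

choose(3, 2)   # 3 groups: 3 possible two-group t tests
choose(6, 2)   # 6 groups: 15 possible two-group comparisons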
The term one-way ANOVA refers to an ANOVA with only one grouping variable in it. The grouping
variable usually has three or more levels because if it has only two, most analysts just do a t test. In an
ANOVA, you are testing how spread out the means of the various levels are from each other. It is not
unusual for students to be asked to calculate an ANOVA manually in a statistics class, but we skip that
here and just describe the result. One result derived from an ANOVA calculation is a test statistic called the F ratio (designated simply as F). F is the ratio of the variability between the groups to the variability within the groups. If the null hypothesis is
true, and no true difference exists between the groups (meaning the average fasting glucose in M = NM
= OTH), then the F ratio should be close to 1. Also, F’s sampling fluctuations should follow the
Fisher F distribution (see Chapter 24), which is actually a family of distribution functions
characterized by the following two numbers seen in the ANOVA calculation:
The numerator degrees of freedom: This number is often designated as dfN or df1, which is one less than the number of groups being compared.
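As a preview of how these pieces show up in R, here is a minimal one-way ANOVA sketch; the data frame nhanes_sub, the outcome glucose, and the three-level marital factor marital3 are hypothetical placeholders rather than the book’s actual objects:

glucose_aov <- aov(glucose ~ marital3, data = nhanes_sub)   # one-way ANOVA with one grouping variable
summary(glucose_aov)   # reports the F ratio along with the numerator and denominator degrees of freedom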